Evolutionary quantum game 



Roland Kay, Neil F. Johnson and Simon C. Benjamin 

Physics Department and Center for Quantum Computation, 
Clarendon Laboratory, Oxford University, Parks Road, Oxford, 0X1 3PU, U.K. 

(February 1, 2008) 



O 
O 

D 



> 

oo 

o 

O 
(N 
O 



Oh: 

a ■ 

a ■ 

^3 : 



X 



We present the first study of a dynamical quantum game. 
Each agent has a 'memory' of her performance over the pre- 
vious m timesteps, and her strategy can evolve in time. The 
game exhibits distinct regimes of optimality. For small m the 
classical game performs better, while for intermediate m the 
relative performance depends on whether the source of qubits 
is 'corrupt'. For large m, the quantum players dramatically 
outperform the classical players by 'freezing' the game into 
high-performing attractors in which evolution ceases. 



PACS Nos.: 03.67.-a, 02.50.Le, 03.65. Ta 



The new field of quantum games is attracting signifi- 
cant interest [0-^. We recently conjectured that novel 
features should arise for quantum games with > 3 
players. Benjamin and Hayden subsequently created 
a Prisoner's Dilemma-like game for TV = 3 with a high- 
payoff 'coherent quantum equilibrium' (CQE). Johnson 
[0 showed that this quantum advantage can become a 
disadvantage when the game's external qubit source is 
corrupted by a noisy 'demon'. So far, all studies have 
been restricted to static games. 

Here we introduce an iterated version of the game, 
where players (agents) may modify their strategies based 
on information from the past - i.e. they 'learn' from 
their mistakes This evolutionary game produces 

highly non-trivial dynamics in both quantum and clas- 
sical regimes, and represents the first step toward un- 
derstanding iterated games employing temporal quan- 
tum coherence. Agents are provided with the minimum 
resources necessary for adaptability - specifically, each 
agent posesses a measure of her past success through a 
parameter $„ whose value reflects the payouts from re- 
cent rounds of the game. The fixed 'memory' param- 
eter m effectively governs the number of rounds upon 
which $ depends (see later Eq. (1)), and turns out to be 
fundamental for deciding the relative superiority of the 
quantum and classical games. For small memory m, the 
classical game is superior (in the sense that the average 
payout per player is higher). For intermediate m, relative 
superiority is determined by the reliability of the exter- 
nal qubit source, while for large m the quantum game 
is radically superior due to evolutionary 'freezing' into a 
high-paying attractor. 
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FIG. 1. Mean total payout per turn as a function of the 
memory size m. Curve with crosses in boxes: classical game 
with input qubit source or 1. Curves with crosses: quantum 
games with input qubit source |0) and |1) respectively. Upper 
inset: Payoff table for our three-player game. Lower inset: 
circuit diagram showing flow of information in the general 
quantum game. When the J gates are removed, the game 
becomes classical. 



We briefly review the static version of the game 
The basic idea is to establish the flow of information in 
a game, and then investigate the consequences of quan- 
tizing this information. In the classical static game, each 
of the A'' = 3 players receives a bit in the zero state, and 
chooses either to 'flip', or 'not flip' this bit prior to re- 
turning it. The three bits are then measured, and the 
payoff to the players is determined by the table shown in 
Fig. 1. This payoff table is such that each player's best 
choice of strategy is simply to choose to 'flip' regardless of 
what any other player may be doing. Given that all play- 
ers realize this, the inevitable outcome is that all play- 
ers choose 'flip' - this is the game's 'dominant strategy 
equilibrium' (DSE). The final measured state is there- 
fore 111, and the corresponding payout to each player 
is just 2 points. This result arises despite the existence 
of cooperative strategies, such as each player choosing 
'flip' with probability 80%, which would yield a higher 
(expected) payoff to all players: the problem is that any 
given player is better off switching strategy to 'definitely 
flip' hence this cooperative strategy profile does not oc- 
cur. This is exactly the same defection problem seen in 
the famous game Prisoner's Dilemma. The quantization 
process involves: (a) generalizing each bit to a qubit 
a|0) -f (b) generalizing the moves available to the 
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players from 'flip'/'don't flip' to a set of quantum opera- 
tions, and (c) introducing 'gates' to entangle the qubits 

- specifically the entanghng gate is J = -7= (7®-^ + zF®"^) 

where F = &x- Without entanglement, the quantum 
game is trivially equivalent to the classical variant |^] 

- therefore we can make a correspondence between the 
quantum and classical games simply by removing the J 
gates. The lower inset in Fig.l shows the flow of informa- 
tion in the quantum game. In general the actions of the 
players in the quantum game will result in a final state 
which is a superposition. Measurement then collapses 
this state to one of the classical outcomes, and the payoff 
is then be determined from the table (see Fig. 1) as in 
the classical game. Among the conclusions of Ref. |^ was 
the discovery that the quantum players could escape the 
DSE that traps the classical players, and hence outper- 
form the classical players by a considerable factor. In Ref. 
it was shown that this quantum advantage depends 
on the reliability of the source of qubits: if the source is 
believed to be generating qubits in the |0} state, but is 
in fact generating qubits in state |1), then the classical 
game players will out-perform the quantum ones. 

Following the previous literature on iterated classical 
multi-player games we limit the classical moves to 
three options: 'definitely flip', 'definitely do not fiip' and 
'flip with probability 1/2'. These options arc denoted 
by setting p, the probability of leaving the input qubit 
unflipped, equal to 0,1 or 1/2 respectively. This sim- 
plification from the full (continuous) range of physically 
possible p values clarifies the analysis while retaining the 
basic structure of the game. There are now 3^ = 27 pos- 
sible profiles or 'configurations' (pi,p2,P3)- These yield 
ten 'classes' each containing C > 1 configurations which 
are equivalent under interchange of player label ^ . The 
upper table in Fig. 2 shows the average payoffs for each 
configuration class for the classical game. Given that 
the input is 0, the dominant strategy equilibrium corre- 
sponds to all players choosing p = 0, i.e. class (iv) in the 
table. Hence although the continuous-parameter p-space 
has been discretized to only three values, this description 
includes the fundamental dominant strategy equilibrium. 

To maintain a correspondence between the quantum 
and classical games, we also permit our quantum players 
just three different moves: /, ax^ and ■^{(Jx + '^z)- It was 
earlier shown that these moves can give rise to superior 
quantum performance for the static quantum game with 
reliable qubit source Moreover when the entangling 
J gates are removed, making the game classical, these 
three moves correctly correspond to our allowed classi- 
cal moves p — 1, 0, 1/2 (respectively). Therefore we label 
the three quantum moves with the same p notation. A 
fully comprehensive study of the quantum game would 
also examine the case where the players are permitted a 
larger set of moves, e.g. one that is closed under compo- 
sition. However the aim here is to make a first direct 



move-for-move comparison between the iterated quan- 
tum and classical games, and this necessitates restricting 
the larger set of quantum options. The resulting table 
(see Fig. 2) provides a simple quantum analog of the 
classical case. 
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Class 


p=0 


p=1/2 


p=1 


Average 


i) 




aaa(1/2)[1/2]1/2 




(1/2)[1/2] 


ii) 


a(21/4)[-17/4]1/2 


aa(3/4)[1/4]1/2 




(9/4)[-5/4] 


iii) 


aa(11/2)[-9/2]1/2 


a(3/2)[1/2]1 




(25/6)[-17/6] 


iv) 


aaa{2)[0]1 






(2)[0] 


V) 






aaa(0)[2]1 


(0)[2] 


vi) 


a(1)[1l1 




aa(-9)[9]0 


{-17/3)[19/3] 


vii) 


aa(9)[-9]0 




a(1)[1]1 


{19/3)[-17/3] 


viii] 


a(5)[-4]1/2 


a(0)[0]0 


a[-4][5]1/2 


(1/3)[1/3] 


ix) 




aa(1/4)[3/4]1/2 


a(-17/4)[21/4]1/2 


(-5/4)[9/4] 


X) 




a(1/2)[3/2]1 


aa(-9/2)[11/2]1/2 


{-17/6)[25/6] 



Quantum 



Class 


p=0 


p=1/2 


p=1 


Average 


i) 




aaa(-15/4)[19/4J1/2 




(-15/4)[19/4] 


ii) 


a(-15/4)[19/4]1/2 


aa(-15/4)[19/4]1/2 




(-15/4)[19/4] 


iii) 


aa(-7/2)[9/2l1/2 


a(3/2)[1/2]1 




(-11/6)[19/6] 


Iv) 


aaa(2)[0]1 






(2)[0] 


V) 






aaa(0)[2]1 


(0)[2] 


vi) 


a(1)[1I1 




aa(-9)[9]0 


(-17/3)[19/3] 


vll) 


aa(9)[-9]0 




a(1)[1]1 


(19/3)[-17/31 


viii) 


a(5)[-4]1/2 


a(9)[-9]0 


a(5)[-4]1/2 


(19/3)[-17/3] 


ix) 




aa(19/4)[-15/4]1/2 


a(1 9/4) [-15/4)1/2 


(19/4)[-15/41 


X) 




a(3/2)[1/2l1 


aa{-7/2)[9/2]1/2 


(-11/6)[19/61 



FIG. 2. Average payoffs for classical game (upper) and 
quantum game (lower). Agents are denoted by 'a', p cor- 
responds to the probability of not flipping the input qubit - 
for the quantum case, p is an operator (see text). Average 
payoffs for input qubit are shown as (...); those for input 1 
are shown as [...]; those for 50 : 50 mixture of input qubits are 
shown without parentheses. Final column gives the average 
payoff per player for a given input qubit. 

We assume that in both quantum and classical games, 
players are unable to communicate between themselves 
and hence cannot coordinate which player picks which 
strategy. In the quantum game, this is more critical since 
the CQE (i.e. the Nash equilibrium given by class (viii) 
in the lower table of Fig. 2) involves players using differ- 
ent p's. We will find that the evolutionary aspect of our 
iterated game allows the quantum players to find coop- 
erative solutions even in the absence of communication, 
provided that the memory m is not too small. 

The agents are allowed to evolve ('mutate') their 
strategies based on past success, according to the fol- 
lowing prescription. After timestep t — 1, each player 
holds a 'moving average' $m,t-i of her payoffs which is 
then updated at timestep t: 

TO— 1 1 

Wm.t = $m,t-l H A* (1) 

TO TO 

where %m.t is the updated moving average and Aj is the 
payoff to the player at turn t. The information about 
previous outcomes therefore has a 'half life' since the 
contribution of a given round's payoff falls exponentially 
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with successive rounds. When the player's moving aver- 
age falls below a certain threshold d, she mutates - she 
chooses one of the other two allowed strategies with a 
50 : 50 chance. After mutation, the moving average $m 
continues to be updated, but the player enters a 'trial pe- 
riod' of Y turns before considering mutating again. For 
simplicity, we set Y = m for the remainder of this paper. 
We also consider fixed external qubit source orientations, 
although each of these restrictions can be relaxed. We 
choose a mutation threshold d = 3.0 in order to be above 
the 2 point maximum possible payoff per player in the 
classical static game (see payoff table in Fig. 1). We per- 
formed runs with 10^ timesteps - this is sufficient for the 
resulting statistics to be stable to within a few percent. 

Figure 1 shows the results for the average payoff per 
turn for the three agents combined. The classical domi- 
nant strategy equilibrium (2, 2, 2) corresponds to a value 
6, while the CQE (5,9,5) and its permutations corre- 
spond to a value 19. Figure 3 shows the corresponding 
average lifetime of the agents before mutation. Figure 1 
shows that for memory to < 8, the quantum game players 
are worse off than the classical game players regardless 
of the input qubit source orientation (0 or 1). (The par- 
ticular orientation of the fixed source doesn't affect the 
results for the classical game.) The classical game play- 
ers are also less prone to mutation, as seen from Figure 
3. For 8 < TO < 25, the quantum game players do better 
than the classical players if the source is (as they be- 
lieve it is) but do worse if the source is 1 (i.e. the source 
is 'corrupt' j^]). The same is qualitatively true for the 
lifetimes (Fig. 3) but for a smaller range 8 < m < 19. 
For TO > 25, the quantum game players achieve better 
average payoffs than the classical game players irrespec- 
tive of the source orientation, but those with the reliable 
(i.e. non-corrupt) source do significantly better. They 
eventually 'freeze' into a cooperative state, as explained 
later. 



Mean gene lifetimes over 10,000,000 iterations. 
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FIG. 3. Average time agents maintain a given strategy be- 
fore mutating, as a function of the memory m. Inset: Results 
for a larger y-scale, illustrating the 'freezing' of the dynamics 
in the quantum game with a reliable qubit source (s — 0). 



Figure 1 reveals an interesting effect within the classi- 
cal game: for to > 1 the classical game players do better 
than the classical dominant strategy equilibrium value 
of 2 -I- 2 -I- 2 = 6. If we had set the threshold d < 2, 
the players would quickly 'freeze' into the configuration 
p = 0, 0, 0, which guarantees 2 points per player per turn. 
By adopting a higher satisfaction limit, i.e. by being 
more 'greedy', the classical players introduce instability 
into the game - although this implies that in any given 
round at least one player is typically worse off, in the 
long term the instability actually benefits all the players. 

The explanation for the interesting large to behavior 
lies in the cycles through which the game moves on short 
and long timescales. In general, the behavior of both 
classical and quantum systems for larger to is dominated 
by a small number of possible attractors which trap the 
system for a certain period, before it jumps into a brief 
period of turbulence and then ends up in another attrac- 
tor. 




FIG. 4. Schematic game dynamics showing the attractors 
which dominate both the classical and quantum games in the 
large m limit, (a) The classical game, and (b) the quantum 
game with a reliable source (s = 0). Ellipses represent spe- 
cific strategy profiles and are labeled by the p values adopted 
by the three players. The circle labeled T represents the set 
of all other profiles, which are visited only during brief tur- 
bulent periods in the game's dynamics. Arrows indicate the 
probability that one profile follows another over a period of 
m rounds. Data are collected over lO'^ rounds with m — 30. 
In the classical game the profile (pi = 0,p2 ~ 0,P3 — 0), cor- 
responding to the dominant strategy equilibrium, occurs in 
each of the three attractor cycles. 

Figure 4 shows the structure of the attractors in the 
classical and quantum cases. In the classical game, the 
attractors correspond to two players fixing their strategy 
as 'definitely fiip' {p — 0). While they maintain these 
choices, the third player can never achieve satisfactory 
payouts, hence the short timescale dynamics correspond 
to this third player jumping around the three possible p 
values, constantly mutating. Eventually one of the other 
two players will encounter a run of losses and mutate her 
strategy. The game then enters a (poorly performing) 
turbulent period before settling back into either the pre- 
vious attractor or one of the 2 permutation-equivalent 
cycles. In the quantum case, however, these three short 
cycles are replaced by six attractors, each a single config- 
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uration such as p = 0, 1, 1/2. Because all such attractors 
yield satisfactory expected payoff to all players, the sys- 
tem will remain in an attractor until one player has an 
exceptionally long run of losses. Figure 3 indicates that 
the probability of this occurence falls exponentially with 
771 - i.e. the system 'freezes' into one configuration profile. 

In summary, we have presented the first results for 
an evolutionary quantum game. The dynamics are non- 
trivial, even in the present case where the game's history 
is assumed to be classical in that a measurement is taken 
at each timestep. The next logical step is to to allow for 
a superposition (or 'multiverse') of histories over R > 1 
turns. Because the coherence in the present game need 
only be maintained for R = 1 turns, the present game 
could be implemented as an algorthim in an elementary 
quantum computer containing 3 or more qubits - such 
an experiment could be performed with existing NMR 
technologies. More generally, it is not inconceivable that 
such 'games' are already being played at some micro- 
scopic level in the physical world - indeed, it has recently 
been shown that nature plays classical dominant-strategy 
games using clones of a virus that infects bacteria JTot. 
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